Overview

Dataset statistics

Number of variables: 38
Number of observations: 1393510
Missing cells: 6458063
Missing cells (%): 12.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 404.0 MiB
Average record size in memory: 304.0 B
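
The two derived figures in these statistics follow from the raw counts. A minimal sketch of the arithmetic, assuming the missing-cell percentage is taken over all cells (variables × observations) and that 1 MiB is 2**20 bytes; the variable names are illustrative and not part of the report:

```python
# Raw counts copied from the dataset statistics.
n_variables = 38
n_observations = 1_393_510
missing_cells = 6_458_063
total_size_mib = 404.0

# Share of missing cells over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average bytes per row, assuming 1 MiB = 2**20 bytes.
avg_record_bytes = total_size_mib * 2**20 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Both results round to the values shown in the report, which suggests the percentages are computed over the full 38 × 1393510 grid.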

Variable types

Numeric: 7
Unsupported: 10
Categorical: 21

Alerts

USU_ESTADO has constant value "A" (Constant)
PAIS_RES has constant value "ECUADOR" (Constant)
INS_TIPO has constant value "I" (Constant)
CANTON_RESIDE has a high cardinality: 223 distinct values (High cardinality)
PARROQUIA_RESIDE has a high cardinality: 1134 distinct values (High cardinality)
ENC_FECHA_IMPRIME has a high cardinality: 181595 distinct values (High cardinality)
USU_NACIONALIDAD has a high cardinality: 66 distinct values (High cardinality)
Unnamed: 0 is highly overall correlated with COD_PROVINCIA_RESIDE and 2 other fields (High correlation)
INS_ID is highly overall correlated with INS_TIPO_INSCRIPCION and 2 other fields (High correlation)
INS_PASO is highly overall correlated with INS_ESTADO and 1 other field (High correlation)
COD_PROVINCIA_RESIDE is highly overall correlated with Unnamed: 0 and 3 other fields (High correlation)
COD_CANTON_RESIDE is highly overall correlated with Unnamed: 0 and 3 other fields (High correlation)
COD_PARROQUIA_RESIDE is highly overall correlated with Unnamed: 0 and 3 other fields (High correlation)
INS_ESTADO is highly overall correlated with INS_PASO and 2 other fields (High correlation)
INS_TIPO_INSCRIPCION is highly overall correlated with INS_ID and 4 other fields (High correlation)
PROVINCIA_RESIDE is highly overall correlated with COD_PROVINCIA_RESIDE and 2 other fields (High correlation)
INS_AUTOIDENTIFICACION is highly overall correlated with archivo and 2 other fields (High correlation)
archivo is highly overall correlated with INS_ID and 3 other fields (High correlation)
PER_ID is highly overall correlated with INS_ID and 3 other fields (High correlation)
CARGA_ENCUESTA is highly overall correlated with INS_ESTADO and 4 other fields (High correlation)
FINALIZA_INSCRIPCION is highly overall correlated with INS_PASO and 2 other fields (High correlation)
OTRA_DISCAPACIDAD is highly overall correlated with INS_TIPO_INSCRIPCION (High correlation)
TITULO_HOMOLOGADO is highly overall correlated with INS_TIPO_INSCRIPCION (High correlation)
INTERNET_DOMICILIO is highly overall correlated with INS_TIPO_INSCRIPCION and 1 other field (High correlation)
COMPUTADORA_DOMICILIO is highly overall correlated with INS_TIPO_INSCRIPCION and 1 other field (High correlation)
USU_ESTADO_CIVIL is highly imbalanced (79.5%) (Imbalance)
INS_ESTADO is highly imbalanced (78.3%) (Imbalance)
INS_TIPO_INSCRIPCION is highly imbalanced (> 99.9%) (Imbalance)
ENC_FECHA_IMPRIME is highly imbalanced (57.7%) (Imbalance)
USU_NACIONALIDAD is highly imbalanced (97.0%) (Imbalance)
FINALIZA_INSCRIPCION is highly imbalanced (58.7%) (Imbalance)
OTRA_DISCAPACIDAD is highly imbalanced (99.6%) (Imbalance)
TITULO_HOMOLOGADO is highly imbalanced (98.0%) (Imbalance)
INTERNET_DOMICILIO is highly imbalanced (57.4%) (Imbalance)
INS_ID has 523300 (37.6%) missing values (Missing)
COD_PAIS_RESIDE has 25783 (1.9%) missing values (Missing)
PAIS_RES has 40125 (2.9%) missing values (Missing)
COD_PROVINCIA_RESIDE has 40125 (2.9%) missing values (Missing)
PROVINCIA_RESIDE has 40125 (2.9%) missing values (Missing)
COD_CANTON_RESIDE has 40125 (2.9%) missing values (Missing)
CANTON_RESIDE has 40125 (2.9%) missing values (Missing)
COD_PARROQUIA_RESIDE has 40125 (2.9%) missing values (Missing)
PARROQUIA_RESIDE has 40125 (2.9%) missing values (Missing)
INS_AUTOIDENTIFICACION has 45104 (3.2%) missing values (Missing)
ENC_ETERMINADA has 114116 (8.2%) missing values (Missing)
ENC_FECHA_UPLOAD_ENCUESTA has 341991 (24.5%) missing values (Missing)
ENC_FECHA_IMPRIME has 600996 (43.1%) missing values (Missing)
ENC_FECHA_FINALIZA_INSCRIPCION has 80957 (5.8%) missing values (Missing)
PER_ID has 267251 (19.2%) missing values (Missing)
USU_NACIONALIDAD has 267341 (19.2%) missing values (Missing)
USU_NACIONALIDAD_EXTRANJERA has 348841 (25.0%) missing values (Missing)
IAS_OTRA_DISCAPACIDAD has 318266 (22.8%) missing values (Missing)
INS_TIPO has 267251 (19.2%) missing values (Missing)
CARGA_ENCUESTA has 283543 (20.3%) missing values (Missing)
FINALIZA_INSCRIPCION has 267251 (19.2%) missing values (Missing)
OTRA_DISCAPACIDAD has 602388 (43.2%) missing values (Missing)
TITULO_HOMOLOGADO has 605682 (43.5%) missing values (Missing)
INTERNET_DOMICILIO has 605682 (43.5%) missing values (Missing)
COMPUTADORA_DOMICILIO has 605682 (43.5%) missing values (Missing)
USU_FECHAREGISTRO is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
USU_FECHA_NAC is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
INS_FECHA is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
INS_FECHA_ACONDICIONES is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
COD_PAIS_RESIDE is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
ENC_ETERMINADA is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
ENC_FECHA_UPLOAD_ENCUESTA is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
ENC_FECHA_FINALIZA_INSCRIPCION is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
USU_NACIONALIDAD_EXTRANJERA is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
IAS_OTRA_DISCAPACIDAD is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
INS_PASO has 28508 (2.0%) zeros (Zeros)
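
The Constant, High cardinality, and Missing alerts in this list can be reproduced from per-column counts alone. The sketch below is only illustrative: `derive_alerts` is a hypothetical helper, and both thresholds (more than 1% missing, more than 50 distinct values) are assumptions inferred from which columns this report does and does not flag, not settings read from its configuration.

```python
# Thresholds are assumptions inferred from this report, not confirmed settings.
MISSING_THRESHOLD = 0.01      # flag columns with more than 1% missing values
CARDINALITY_THRESHOLD = 50    # flag categoricals with more than 50 distinct values


def derive_alerts(n_rows, n_distinct, n_missing, categorical=True):
    """Return the alert labels a column summary would trigger (hypothetical helper)."""
    alerts = []
    if n_distinct == 1:
        alerts.append("Constant")
    if categorical and n_distinct > CARDINALITY_THRESHOLD:
        alerts.append("High cardinality")
    if n_missing / n_rows > MISSING_THRESHOLD:
        alerts.append("Missing")
    return alerts


N = 1_393_510
print(derive_alerts(N, 1, 0))          # USU_ESTADO
print(derive_alerts(N, 1, 40_125))     # PAIS_RES
print(derive_alerts(N, 66, 267_341))   # USU_NACIONALIDAD
print(derive_alerts(N, 3, 4_908))      # INS_SEXO (0.4% missing, no alert in the report)
```

Under these assumed thresholds the four calls match the alerts listed for those columns, including the absence of any alert for INS_SEXO.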

Reproduction

Analysis started: 2023-03-10 07:48:46.570916
Analysis finished: 2023-03-10 07:50:05.958219
Duration: 1 minute and 19.39 seconds
Software version: pandas-profiling v3.6.6
Download configuration: config.json

Variables

Unnamed: 0
Real number (ℝ)

Distinct: 312934
Distinct (%): 22.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 141704.46
Minimum: 1
Maximum: 312934
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.6 MiB

Quantile statistics

Minimum: 1
5-th percentile: 13936
Q1: 69676
Median: 139351.5
Q3: 209027
95-th percentile: 280868.55
Maximum: 312934
Range: 312933
Interquartile range (IQR): 139351

Descriptive statistics

Standard deviation: 84111.471
Coefficient of variation (CV): 0.59356967
Kurtosis: -1.0784133
Mean: 141704.46
Median Absolute Deviation (MAD): 69675.5
Skewness: 0.11838041
Sum: 1.9746659 × 10^11
Variance: 7.0747396 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                    Count      Frequency (%)
1                        5          < 0.1%
142724                   5          < 0.1%
142648                   5          < 0.1%
142649                   5          < 0.1%
142650                   5          < 0.1%
142651                   5          < 0.1%
142652                   5          < 0.1%
142653                   5          < 0.1%
142654                   5          < 0.1%
142655                   5          < 0.1%
Other values (312924)    1393460    > 99.9%

Value     Count    Frequency (%)
1         5        < 0.1%
2         5        < 0.1%
3         5        < 0.1%
4         5        < 0.1%
5         5        < 0.1%
6         5        < 0.1%
7         5        < 0.1%
8         5        < 0.1%
9         5        < 0.1%
10        5        < 0.1%

Value     Count    Frequency (%)
312934    1        < 0.1%
312933    1        < 0.1%
312932    1        < 0.1%
312931    1        < 0.1%
312930    1        < 0.1%
312929    1        < 0.1%
312928    1        < 0.1%
312927    1        < 0.1%
312926    1        < 0.1%
312925    1        < 0.1%

USU_FECHAREGISTRO
Unsupported

REJECTED  UNSUPPORTED 

Missing98
Missing (%)< 0.1%
Memory size10.6 MiB

USU_ESTADO
Categorical

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size10.6 MiB
A
1393510 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters1393510
Distinct characters1
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowA
2nd rowA
3rd rowA
4th rowA
5th rowA

Common Values

ValueCountFrequency (%)
A 1393510
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
a 1393510
100.0%

Most occurring characters

ValueCountFrequency (%)
A 1393510
100.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1393510
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 1393510
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1393510
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 1393510
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1393510
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 1393510
100.0%

USU_ESTADO_CIVIL
Categorical

Distinct6
Distinct (%)< 0.1%
Missing578
Missing (%)< 0.1%
Memory size10.6 MiB
S
1257561 
C
 
116265
D
 
15561
U
 
2339
V
 
1205

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters1392932
Distinct characters6
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st rowS
2nd rowS
3rd rowS
4th rowS
5th rowS

Common Values

Value        Count      Frequency (%)
S            1257561    90.2%
C            116265     8.3%
D            15561      1.1%
U            2339       0.2%
V            1205       0.1%
-            1          < 0.1%
(Missing)    578        < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
s 1257561
90.3%
c 116265
 
8.3%
d 15561
 
1.1%
u 2339
 
0.2%
v 1205
 
0.1%
1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
S 1257561
90.3%
C 116265
 
8.3%
D 15561
 
1.1%
U 2339
 
0.2%
V 1205
 
0.1%
- 1
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1392931
> 99.9%
Dash Punctuation 1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 1257561
90.3%
C 116265
 
8.3%
D 15561
 
1.1%
U 2339
 
0.2%
V 1205
 
0.1%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1392931
> 99.9%
Common 1
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 1257561
90.3%
C 116265
 
8.3%
D 15561
 
1.1%
U 2339
 
0.2%
V 1205
 
0.1%
Common
ValueCountFrequency (%)
- 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1392932
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S 1257561
90.3%
C 116265
 
8.3%
D 15561
 
1.1%
U 2339
 
0.2%
V 1205
 
0.1%
- 1
 
< 0.1%

USU_FECHA_NAC
Unsupported

REJECTED  UNSUPPORTED 

Missing179
Missing (%)< 0.1%
Memory size10.6 MiB

INS_SEXO
Categorical

Distinct3
Distinct (%)< 0.1%
Missing4908
Missing (%)0.4%
Memory size10.6 MiB
MUJER
748840 
HOMBRE
636307 
SIN DATO
 
3455

Length

Max length8
Median length5
Mean length5.4657
Min length5

Characters and Unicode

Total characters7589682
Distinct characters15
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowMUJER
2nd rowSIN DATO
3rd rowHOMBRE
4th rowMUJER
5th rowMUJER

Common Values

Value        Count     Frequency (%)
MUJER        748840    53.7%
HOMBRE       636307    45.7%
SIN DATO     3455      0.2%
(Missing)    4908      0.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
mujer 748840
53.8%
hombre 636307
45.7%
sin 3455
 
0.2%
dato 3455
 
0.2%

Most occurring characters

ValueCountFrequency (%)
M 1385147
18.3%
E 1385147
18.3%
R 1385147
18.3%
U 748840
9.9%
J 748840
9.9%
O 639762
8.4%
H 636307
8.4%
B 636307
8.4%
S 3455
 
< 0.1%
I 3455
 
< 0.1%
Other values (5) 17275
 
0.2%
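
The character-level figures for INS_SEXO can be recovered from the three category counts, because each character's total is the sum of the counts of the categories that contain it. A minimal sketch using only the counts reported for this variable; the variable names are illustrative:

```python
from collections import Counter

# Category counts copied from the INS_SEXO "Common Values" table.
value_counts = {"MUJER": 748_840, "HOMBRE": 636_307, "SIN DATO": 3_455}

# Weight each character by how often its category occurs.
char_counts = Counter()
for value, count in value_counts.items():
    for ch in value:
        char_counts[ch] += count

total_chars = sum(char_counts.values())
mean_length = total_chars / sum(value_counts.values())

print(char_counts.most_common(3))
print(total_chars, len(char_counts), round(mean_length, 4))
```

The totals agree with the report's "Total characters", "Distinct characters", and "Mean length" entries for this variable, which are computed over non-missing rows only.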

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 7586227
> 99.9%
Space Separator 3455
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M 1385147
18.3%
E 1385147
18.3%
R 1385147
18.3%
U 748840
9.9%
J 748840
9.9%
O 639762
8.4%
H 636307
8.4%
B 636307
8.4%
S 3455
 
< 0.1%
I 3455
 
< 0.1%
Other values (4) 13820
 
0.2%
Space Separator
ValueCountFrequency (%)
3455
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7586227
> 99.9%
Common 3455
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
M 1385147
18.3%
E 1385147
18.3%
R 1385147
18.3%
U 748840
9.9%
J 748840
9.9%
O 639762
8.4%
H 636307
8.4%
B 636307
8.4%
S 3455
 
< 0.1%
I 3455
 
< 0.1%
Other values (4) 13820
 
0.2%
Common
ValueCountFrequency (%)
3455
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7589682
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
M 1385147
18.3%
E 1385147
18.3%
R 1385147
18.3%
U 748840
9.9%
J 748840
9.9%
O 639762
8.4%
H 636307
8.4%
B 636307
8.4%
S 3455
 
< 0.1%
I 3455
 
< 0.1%
Other values (5) 17275
 
0.2%

INS_ID
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 870210
Distinct (%): 100.0%
Missing: 523300
Missing (%): 37.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 9829289.1
Minimum: 7006528
Maximum: 11959428
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.6 MiB

Quantile statistics

Minimum7006528
5-th percentile7093550.9
Q17441633.5
median10329143
Q311524272
95-th percentile11872393
Maximum11959428
Range4952900
Interquartile range (IQR)4082638

Descriptive statistics

Standard deviation: 1801201.9
Coefficient of variation (CV): 0.18324844
Kurtosis: -1.3439195
Mean: 9829289.1
Median Absolute Deviation (MAD): 1294532
Skewness: -0.51296021
Sum: 8.5535457 × 10^12
Variance: 3.2443284 × 10^12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
7433157 1
 
< 0.1%
10593386 1
 
< 0.1%
10514992 1
 
< 0.1%
9996407 1
 
< 0.1%
10073826 1
 
< 0.1%
10243339 1
 
< 0.1%
10527877 1
 
< 0.1%
10447134 1
 
< 0.1%
10021292 1
 
< 0.1%
10325767 1
 
< 0.1%
Other values (870200) 870200
62.4%
(Missing) 523300
37.6%
ValueCountFrequency (%)
7006528 1
< 0.1%
7006530 1
< 0.1%
7006532 1
< 0.1%
7006534 1
< 0.1%
7006536 1
< 0.1%
7006538 1
< 0.1%
7006540 1
< 0.1%
7006542 1
< 0.1%
7006544 1
< 0.1%
7006546 1
< 0.1%
ValueCountFrequency (%)
11959428 1
< 0.1%
11959421 1
< 0.1%
11959419 1
< 0.1%
11959417 1
< 0.1%
11959415 1
< 0.1%
11959413 1
< 0.1%
11959411 1
< 0.1%
11959409 1
< 0.1%
11959407 1
< 0.1%
11959405 1
< 0.1%

INS_FECHA
Unsupported

REJECTED  UNSUPPORTED 

Missing0
Missing (%)0.0%
Memory size10.6 MiB

INS_ESTADO
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size10.6 MiB
TERMINADO
1345166 
INCOMPLETO
 
48344

Length

Max length10
Median length9
Mean length9.0346923
Min length9

Characters and Unicode

Total characters12589934
Distinct characters12
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowTERMINADO
2nd rowINCOMPLETO
3rd rowTERMINADO
4th rowTERMINADO
5th rowTERMINADO

Common Values

Value         Count      Frequency (%)
TERMINADO     1345166    96.5%
INCOMPLETO    48344      3.5%
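
The 78.3% imbalance figure reported for INS_ESTADO is consistent with one minus the normalized Shannon entropy of the class counts. A minimal sketch under that assumption: the definition is inferred from the fact that it reproduces the reported values, not taken from the report itself, and `imbalance_score` is a hypothetical helper.

```python
import math


def imbalance_score(counts):
    """1 minus normalized Shannon entropy (0 = balanced, near 1 = one dominant class)."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # assumption: a single class is left unscored in this sketch
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)


# INS_ESTADO counts: TERMINADO, INCOMPLETO.
score = imbalance_score([1_345_166, 48_344])
print(f"{score:.1%}")
```

Applying the same function to the six USU_ESTADO_CIVIL counts gives roughly 79.5%, which also matches the alert for that column.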

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
terminado 1345166
96.5%
incompleto 48344
 
3.5%

Most occurring characters

ValueCountFrequency (%)
O 1441854
11.5%
T 1393510
11.1%
E 1393510
11.1%
M 1393510
11.1%
I 1393510
11.1%
N 1393510
11.1%
R 1345166
10.7%
A 1345166
10.7%
D 1345166
10.7%
C 48344
 
0.4%
Other values (2) 96688
 
0.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 12589934
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
O 1441854
11.5%
T 1393510
11.1%
E 1393510
11.1%
M 1393510
11.1%
I 1393510
11.1%
N 1393510
11.1%
R 1345166
10.7%
A 1345166
10.7%
D 1345166
10.7%
C 48344
 
0.4%
Other values (2) 96688
 
0.8%

Most occurring scripts

ValueCountFrequency (%)
Latin 12589934
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
O 1441854
11.5%
T 1393510
11.1%
E 1393510
11.1%
M 1393510
11.1%
I 1393510
11.1%
N 1393510
11.1%
R 1345166
10.7%
A 1345166
10.7%
D 1345166
10.7%
C 48344
 
0.4%
Other values (2) 96688
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12589934
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
O 1441854
11.5%
T 1393510
11.1%
E 1393510
11.1%
M 1393510
11.1%
I 1393510
11.1%
N 1393510
11.1%
R 1345166
10.7%
A 1345166
10.7%
D 1345166
10.7%
C 48344
 
0.4%
Other values (2) 96688
 
0.8%

INS_PASO
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.8476566
Minimum: 0
Maximum: 5
Zeros: 28508
Zeros (%): 2.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.6 MiB

Quantile statistics

Minimum0
5-th percentile5
Q15
median5
Q35
95-th percentile5
Maximum5
Range5
Interquartile range (IQR)0

Descriptive statistics

Standard deviation0.82491112
Coefficient of variation (CV)0.17016699
Kurtosis26.971244
Mean4.8476566
Median Absolute Deviation (MAD)0
Skewness-5.3462612
Sum6755258
Variance0.68047836
MonotonicityNot monotonic
Histogram with fixed size bins (bins=6)
Value    Count      Frequency (%)
5        1345166    96.5%
0        28508      2.0%
1        16278      1.2%
4        2797       0.2%
3        440        < 0.1%
2        321        < 0.1%

Value    Count      Frequency (%)
0        28508      2.0%
1        16278      1.2%
2        321        < 0.1%
3        440        < 0.1%
4        2797       0.2%
5        1345166    96.5%

Value    Count      Frequency (%)
5        1345166    96.5%
4        2797       0.2%
3        440        < 0.1%
2        321        < 0.1%
1        16278      1.2%
0        28508      2.0%
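
Because INS_PASO takes only six values, its descriptive statistics can be re-derived from the frequency table without access to the raw data. A minimal sketch; the use of the sample variance (n - 1 denominator) is an assumption, though at this sample size it differs from the population variance only in the seventh decimal place.

```python
import math

# Frequency table for INS_PASO copied from the report: value -> count.
freq = {5: 1_345_166, 0: 28_508, 1: 16_278, 4: 2_797, 3: 440, 2: 321}

n = sum(freq.values())
total = sum(value * count for value, count in freq.items())
mean = total / n

# Sample variance (n - 1 in the denominator) is an assumption here.
variance = sum(count * (value - mean) ** 2 for value, count in freq.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation

print(n, total)
print(round(mean, 4), round(variance, 4), round(std, 4), round(cv, 4))
```

The resulting count, sum, mean, variance, and coefficient of variation agree with the values listed for this variable.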

INS_FECHA_ACONDICIONES
Unsupported

REJECTED  UNSUPPORTED 

Missing0
Missing (%)0.0%
Memory size10.6 MiB

INS_TIPO_INSCRIPCION
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size10.6 MiB
1
1393509 
2
 
1

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters1393510
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st row1
2nd row1
3rd row1
4th row1
5th row1

Common Values

ValueCountFrequency (%)
1 1393509
> 99.9%
2 1
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1 1393509
> 99.9%
2 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
1 1393509
> 99.9%
2 1
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1393510
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 1393509
> 99.9%
2 1
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 1393510
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 1393509
> 99.9%
2 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1393510
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 1393509
> 99.9%
2 1
 
< 0.1%

COD_PAIS_RESIDE
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing25783
Missing (%)1.9%
Memory size10.6 MiB

PAIS_RES
Categorical

CONSTANT  MISSING 

Distinct1
Distinct (%)< 0.1%
Missing40125
Missing (%)2.9%
Memory size10.6 MiB
ECUADOR
1353385 

Length

Max length7
Median length7
Mean length7
Min length7

Characters and Unicode

Total characters9473695
Distinct characters7
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowECUADOR
2nd rowECUADOR
3rd rowECUADOR
4th rowECUADOR
5th rowECUADOR

Common Values

ValueCountFrequency (%)
ECUADOR 1353385
97.1%
(Missing) 40125
 
2.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
ecuador 1353385
100.0%

Most occurring characters

ValueCountFrequency (%)
E 1353385
14.3%
C 1353385
14.3%
U 1353385
14.3%
A 1353385
14.3%
D 1353385
14.3%
O 1353385
14.3%
R 1353385
14.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 9473695
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 1353385
14.3%
C 1353385
14.3%
U 1353385
14.3%
A 1353385
14.3%
D 1353385
14.3%
O 1353385
14.3%
R 1353385
14.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 9473695
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 1353385
14.3%
C 1353385
14.3%
U 1353385
14.3%
A 1353385
14.3%
D 1353385
14.3%
O 1353385
14.3%
R 1353385
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9473695
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 1353385
14.3%
C 1353385
14.3%
U 1353385
14.3%
A 1353385
14.3%
D 1353385
14.3%
O 1353385
14.3%
R 1353385
14.3%

COD_PROVINCIA_RESIDE
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 25
Distinct (%): < 0.1%
Missing: 40125
Missing (%): 2.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.832139
Minimum: 1
Maximum: 90
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.6 MiB

Quantile statistics

Minimum1
5-th percentile2
Q19
median11
Q317
95-th percentile22
Maximum90
Range89
Interquartile range (IQR)8

Descriptive statistics

Standard deviation5.4755229
Coefficient of variation (CV)0.46276696
Kurtosis3.6915519
Mean11.832139
Median Absolute Deviation (MAD)4
Skewness0.42126605
Sum16013439
Variance29.981351
MonotonicityNot monotonic
Histogram with fixed size bins (bins=25)
ValueCountFrequency (%)
9 346240
24.8%
17 287346
20.6%
13 117514
 
8.4%
7 64222
 
4.6%
1 62016
 
4.5%
12 60946
 
4.4%
11 55761
 
4.0%
18 46361
 
3.3%
10 39947
 
2.9%
5 37008
 
2.7%
Other values (15) 236024
16.9%
(Missing) 40125
 
2.9%
ValueCountFrequency (%)
1 62016
 
4.5%
2 18033
 
1.3%
3 13462
 
1.0%
4 12615
 
0.9%
5 37008
 
2.7%
6 35111
 
2.5%
7 64222
 
4.6%
8 30415
 
2.2%
9 346240
24.8%
10 39947
 
2.9%
ValueCountFrequency (%)
90 138
 
< 0.1%
24 27380
 
2.0%
23 34808
 
2.5%
22 9496
 
0.7%
21 13958
 
1.0%
20 1375
 
0.1%
19 10196
 
0.7%
18 46361
 
3.3%
17 287346
20.6%
16 8410
 
0.6%

PROVINCIA_RESIDE
Categorical

HIGH CORRELATION  MISSING 

Distinct25
Distinct (%)< 0.1%
Missing40125
Missing (%)2.9%
Memory size10.6 MiB
GUAYAS
346240 
PICHINCHA
287346 
MANABI
117514 
EL ORO
64222 
AZUAY
62016 
Other values (20)
476047 

Length

Max length30
Median length16
Mean length7.9367431
Min length4

Characters and Unicode

Total characters10741469
Distinct characters24
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowCOTOPAXI
2nd rowPICHINCHA
3rd rowPICHINCHA
4th rowPICHINCHA
5th rowCAÑAR

Common Values

Value                Count     Frequency (%)
GUAYAS               346240    24.8%
PICHINCHA            287346    20.6%
MANABI               117514    8.4%
EL ORO               64222     4.6%
AZUAY                62016     4.5%
LOS RIOS             60946     4.4%
LOJA                 55761     4.0%
TUNGURAHUA           46361     3.3%
IMBABURA             39947     2.9%
COTOPAXI             37008     2.7%
Other values (15)    236024    16.9%
(Missing)            40125     2.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
guayas 346240
20.8%
pichincha 287346
17.3%
manabi 117514
 
7.1%
los 95754
 
5.8%
el 64222
 
3.9%
oro 64222
 
3.9%
azuay 62016
 
3.7%
rios 60946
 
3.7%
loja 55761
 
3.4%
tungurahua 46361
 
2.8%
Other values (25) 464082
27.9%

Most occurring characters

ValueCountFrequency (%)
A 2051506
19.1%
I 1008797
 
9.4%
C 754661
 
7.0%
S 742240
 
6.9%
H 723979
 
6.7%
O 681182
 
6.3%
N 625019
 
5.8%
U 615340
 
5.7%
G 438986
 
4.1%
Y 408256
 
3.8%
Other values (14) 2691503
25.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 10430390
97.1%
Space Separator 311079
 
2.9%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 2051506
19.7%
I 1008797
9.7%
C 754661
 
7.2%
S 742240
 
7.1%
H 723979
 
6.9%
O 681182
 
6.5%
N 625019
 
6.0%
U 615340
 
5.9%
G 438986
 
4.2%
Y 408256
 
3.9%
Other values (13) 2380424
22.8%
Space Separator
ValueCountFrequency (%)
311079
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10430390
97.1%
Common 311079
 
2.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 2051506
19.7%
I 1008797
9.7%
C 754661
 
7.2%
S 742240
 
7.1%
H 723979
 
6.9%
O 681182
 
6.5%
N 625019
 
6.0%
U 615340
 
5.9%
G 438986
 
4.2%
Y 408256
 
3.9%
Other values (13) 2380424
22.8%
Common
ValueCountFrequency (%)
311079
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10728007
99.9%
None 13462
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 2051506
19.1%
I 1008797
 
9.4%
C 754661
 
7.0%
S 742240
 
6.9%
H 723979
 
6.7%
O 681182
 
6.3%
N 625019
 
5.8%
U 615340
 
5.7%
G 438986
 
4.1%
Y 408256
 
3.8%
Other values (13) 2678041
25.0%
None
ValueCountFrequency (%)
Ñ 13462
100.0%

COD_CANTON_RESIDE
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 225
Distinct (%): < 0.1%
Missing: 40125
Missing (%): 2.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 1186.4998
Minimum: 101
Maximum: 9008
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.6 MiB

Quantile statistics

Minimum101
5-th percentile201
Q1901
median1101
Q31701
95-th percentile2201
Maximum9008
Range8907
Interquartile range (IQR)800

Descriptive statistics

Standard deviation: 547.05169
Coefficient of variation (CV): 0.46106345
Kurtosis: 3.711991
Mean: 1186.4998
Median Absolute Deviation (MAD): 392
Skewness: 0.41708157
Sum: 1.605791 × 10^9
Variance: 299265.55
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1701 244105
17.5%
901 242853
 
17.4%
101 50223
 
3.6%
1101 33793
 
2.4%
1301 33474
 
2.4%
1801 31233
 
2.2%
2301 31197
 
2.2%
701 29105
 
2.1%
907 25403
 
1.8%
1308 23953
 
1.7%
Other values (215) 608046
43.6%
(Missing) 40125
 
2.9%
ValueCountFrequency (%)
101 50223
3.6%
102 613
 
< 0.1%
103 2521
 
0.2%
104 625
 
< 0.1%
105 1724
 
0.1%
106 409
 
< 0.1%
107 220
 
< 0.1%
108 1185
 
0.1%
109 1346
 
0.1%
110 245
 
< 0.1%
ValueCountFrequency (%)
9008 1
 
< 0.1%
9007 7
 
< 0.1%
9004 126
 
< 0.1%
9001 4
 
< 0.1%
2403 5769
 
0.4%
2402 9701
 
0.7%
2401 11910
 
0.9%
2302 3611
 
0.3%
2301 31197
2.2%
2204 1190
 
0.1%

CANTON_RESIDE
Categorical

HIGH CARDINALITY  MISSING 

Distinct223
Distinct (%)< 0.1%
Missing40125
Missing (%)2.9%
Memory size10.6 MiB
DISTRITO METROPOLITANO DE QUITO
244105 
GUAYAQUIL
242853 
CUENCA
 
50223
LOJA
 
33793
PORTOVIEJO
 
33474
Other values (218)
748937 

Length

Max length32
Median length27
Mean length12.297359
Min length3

Characters and Unicode

Total characters16643061
Distinct characters31
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st rowLATACUNGA
2nd rowDISTRITO METROPOLITANO DE QUITO
3rd rowDISTRITO METROPOLITANO DE QUITO
4th rowDISTRITO METROPOLITANO DE QUITO
5th rowLA TRONCAL

Common Values

Value                              Count     Frequency (%)
DISTRITO METROPOLITANO DE QUITO    244105    17.5%
GUAYAQUIL                          242853    17.4%
CUENCA                             50223     3.6%
LOJA                               33793     2.4%
PORTOVIEJO                         33474     2.4%
AMBATO                             31233     2.2%
SANTO DOMINGO                      31197     2.2%
MACHALA                            29105     2.1%
DURAN                              25403     1.8%
MANTA                              23953     1.7%
Other values (213)                 608046    43.6%
(Missing)                          40125     2.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
de 254485
 
11.1%
quito 246224
 
10.8%
distrito 244105
 
10.7%
metropolitano 244105
 
10.7%
guayaquil 242853
 
10.6%
cuenca 50223
 
2.2%
loja 33793
 
1.5%
portoviejo 33474
 
1.5%
ambato 31233
 
1.4%
santo 31197
 
1.4%
Other values (254) 873495
38.2%

Most occurring characters

ValueCountFrequency (%)
A 2031440
12.2%
O 1864901
11.2%
I 1602844
 
9.6%
T 1519984
 
9.1%
U 1029420
 
6.2%
E 968051
 
5.8%
933145
 
5.6%
L 888836
 
5.3%
R 842551
 
5.1%
N 718482
 
4.3%
Other values (21) 4243407
25.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 15703578
94.4%
Space Separator 933145
 
5.6%
Decimal Number 3298
 
< 0.1%
Open Punctuation 1520
 
< 0.1%
Close Punctuation 1520
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 2031440
12.9%
O 1864901
11.9%
I 1602844
10.2%
T 1519984
9.7%
U 1029420
 
6.6%
E 968051
 
6.2%
L 888836
 
5.7%
R 842551
 
5.4%
N 718482
 
4.6%
D 677186
 
4.3%
Other values (16) 3559883
22.7%
Decimal Number
ValueCountFrequency (%)
2 1649
50.0%
4 1649
50.0%
Space Separator
ValueCountFrequency (%)
933145
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1520
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1520
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 15703578
94.4%
Common 939483
 
5.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 2031440
12.9%
O 1864901
11.9%
I 1602844
10.2%
T 1519984
9.7%
U 1029420
 
6.6%
E 968051
 
6.2%
L 888836
 
5.7%
R 842551
 
5.4%
N 718482
 
4.6%
D 677186
 
4.3%
Other values (16) 3559883
22.7%
Common
ValueCountFrequency (%)
933145
99.3%
2 1649
 
0.2%
4 1649
 
0.2%
( 1520
 
0.2%
) 1520
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 16620920
99.9%
None 22141
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 2031440
12.2%
O 1864901
11.2%
I 1602844
 
9.6%
T 1519984
 
9.1%
U 1029420
 
6.2%
E 968051
 
5.8%
933145
 
5.6%
L 888836
 
5.3%
R 842551
 
5.1%
N 718482
 
4.3%
Other values (20) 4221266
25.4%
None
ValueCountFrequency (%)
Ñ 22141
100.0%

COD_PARROQUIA_RESIDE
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 1248
Distinct (%): 0.1%
Missing: 40125
Missing (%): 2.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 118674.34
Minimum: 10101
Maximum: 900851
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.6 MiB

Quantile statistics

Minimum10101
5-th percentile20103
Q190112
median110103
Q3170112
95-th percentile220150
Maximum900851
Range890750
Interquartile range (IQR)80000

Descriptive statistics

Standard deviation: 54707.473
Coefficient of variation (CV): 0.46098823
Kurtosis: 3.7113911
Mean: 118674.34
Median Absolute Deviation (MAD): 39200
Skewness: 0.41734441
Sum: 1.6061207 × 10^11
Variance: 2.9929076 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
90112 92863
 
6.7%
90114 57518
 
4.1%
90104 36759
 
2.6%
90115 25911
 
1.9%
170155 22402
 
1.6%
170108 20548
 
1.5%
170111 16648
 
1.2%
90701 15514
 
1.1%
70102 13061
 
0.9%
170106 12186
 
0.9%
Other values (1238) 1039975
74.6%
(Missing) 40125
 
2.9%
ValueCountFrequency (%)
10101 2750
0.2%
10102 912
 
0.1%
10103 2171
0.2%
10104 672
 
< 0.1%
10105 3330
0.2%
10106 298
 
< 0.1%
10107 1345
0.1%
10108 1402
0.1%
10109 2065
0.1%
10110 758
 
0.1%
ValueCountFrequency (%)
900851 1
 
< 0.1%
900751 7
 
< 0.1%
900451 126
 
< 0.1%
900151 4
 
< 0.1%
240352 2404
0.2%
240351 715
 
0.1%
240304 839
 
0.1%
240303 390
 
< 0.1%
240302 549
 
< 0.1%
240301 872
 
0.1%

PARROQUIA_RESIDE
Categorical

HIGH CARDINALITY  MISSING 

Distinct1134
Distinct (%)0.1%
Missing40125
Missing (%)2.9%
Memory size10.6 MiB
TARQUI
 
100396
XIMENA
 
57518
FEBRES CORDERO
 
36759
PASCUALES
 
25911
CALDERON (CARAPUNGO)
 
22402
Other values (1129)
1110399 

Length

Max length58
Median length46
Mean length11.286471
Min length4

Characters and Unicode

Total characters15274940
Distinct characters47
Distinct categories7
Distinct scripts2
Distinct blocks3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2
Unique (%)< 0.1%

Sample

1st rowLA MATRIZ
2nd rowSAN JUAN
3rd rowCALDERON (CARAPUNGO)
4th rowBELISARIO QUEVEDO
5th rowLA TRONCAL

Common Values

ValueCountFrequency (%)
TARQUI 100396
 
7.2%
XIMENA 57518
 
4.1%
FEBRES CORDERO 36759
 
2.6%
PASCUALES 25911
 
1.9%
CALDERON (CARAPUNGO) 22402
 
1.6%
CHILLOGALLO 20548
 
1.5%
GUAMANÍ 16648
 
1.2%
ELOY ALFARO (DURÁN) 15514
 
1.1%
SUCRE 13176
 
0.9%
MACHALA 13061
 
0.9%
Other values (1124) 1031452
74.0%
(Missing) 40125
 
2.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
san 108969
 
4.7%
tarqui 100424
 
4.3%
de 85707
 
3.7%
la 79328
 
3.4%
el 64132
 
2.8%
ximena 57518
 
2.5%
cordero 38435
 
1.6%
febres 38037
 
1.6%
alfaro 27398
 
1.2%
santa 26932
 
1.2%
Other values (1288) 1703595
73.1%

Most occurring characters

ValueCountFrequency (%)
A 2177437
14.3%
E 1205590
 
7.9%
O 1204665
 
7.9%
1006924
 
6.6%
L 995988
 
6.5%
R 984707
 
6.4%
N 920013
 
6.0%
I 872151
 
5.7%
C 777445
 
5.1%
U 665941
 
4.4%
Other values (37) 4464079
29.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 13961035
91.4%
Space Separator 1006924
 
6.6%
Open Punctuation 135026
 
0.9%
Close Punctuation 135026
 
0.9%
Decimal Number 21710
 
0.1%
Other Punctuation 13602
 
0.1%
Dash Punctuation 1617
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 2177437
15.6%
E 1205590
 
8.6%
O 1204665
 
8.6%
L 995988
 
7.1%
R 984707
 
7.1%
N 920013
 
6.6%
I 872151
 
6.2%
C 777445
 
5.6%
U 665941
 
4.8%
S 663608
 
4.8%
Other values (21) 3493490
25.0%
Decimal Number
ValueCountFrequency (%)
1 6976
32.1%
2 4692
21.6%
5 3566
16.4%
8 3018
13.9%
4 1919
 
8.8%
0 934
 
4.3%
7 424
 
2.0%
6 156
 
0.7%
9 25
 
0.1%
Other Punctuation
ValueCountFrequency (%)
. 13517
99.4%
: 85
 
0.6%
Dash Punctuation
ValueCountFrequency (%)
1084
67.0%
- 533
33.0%
Space Separator
ValueCountFrequency (%)
1006924
100.0%
Open Punctuation
ValueCountFrequency (%)
( 135026
100.0%
Close Punctuation
ValueCountFrequency (%)
) 135026
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 13961035
91.4%
Common 1313905
 
8.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 2177437
15.6%
E 1205590
 
8.6%
O 1204665
 
8.6%
L 995988
 
7.1%
R 984707
 
7.1%
N 920013
 
6.6%
I 872151
 
6.2%
C 777445
 
5.6%
U 665941
 
4.8%
S 663608
 
4.8%
Other values (21) 3493490
25.0%
Common
ValueCountFrequency (%)
1006924
76.6%
( 135026
 
10.3%
) 135026
 
10.3%
. 13517
 
1.0%
1 6976
 
0.5%
2 4692
 
0.4%
5 3566
 
0.3%
8 3018
 
0.2%
4 1919
 
0.1%
1084
 
0.1%
Other values (6) 2157
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 15061624
98.6%
None 212232
 
1.4%
Punctuation 1084
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 2177437
14.5%
E 1205590
 
8.0%
O 1204665
 
8.0%
1006924
 
6.7%
L 995988
 
6.6%
R 984707
 
6.5%
N 920013
 
6.1%
I 872151
 
5.8%
C 777445
 
5.2%
U 665941
 
4.4%
Other values (31) 4250763
28.2%
None
ValueCountFrequency (%)
Í 68491
32.3%
Á 55845
26.3%
É 37200
17.5%
Ó 29888
14.1%
Ñ 20808
 
9.8%
Punctuation
ValueCountFrequency (%)
1084
100.0%

INS_AUTOIDENTIFICACION
Categorical

HIGH CORRELATION  MISSING 

Distinct16
Distinct (%)< 0.1%
Missing45104
Missing (%)3.2%
Memory size10.6 MiB
Mestizo/a
669372 
MESTIZO
451954 
MONTUBIO
 
36777
Indígena
 
35778
Montubio/a
 
34942
Other values (11)
119583 

Length

Max length36
Median length16
Mean length8.5943418
Min length4

Characters and Unicode

Total characters11588662
Distinct characters37
Distinct categories4
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowMESTIZO
2nd rowMULATO
3rd rowMESTIZO
4th rowOTRO
5th rowMESTIZO

Common Values

ValueCountFrequency (%)
Mestizo/a 669372
48.0%
MESTIZO 451954
32.4%
MONTUBIO 36777
 
2.6%
Indígena 35778
 
2.6%
Montubio/a 34942
 
2.5%
INDÍGENA 30274
 
2.2%
Blanco/a 17294
 
1.2%
Afroecuatoriano/a o Afrodescendiente 16340
 
1.2%
AFRODESCENDIENTE 15557
 
1.1%
BLANCO 11247
 
0.8%
Other values (6) 28871
 
2.1%
(Missing) 45104
 
3.2%
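
The Common Values table shows the same self-identification categories under two spellings (for example "Mestizo/a" and "MESTIZO"). The sketch below is one possible way to merge them and recount. The `normalize` helper is hypothetical, and treating both Afro-descendant labels as a single category is an assumption, not something the report states.

```python
from collections import Counter

# Counts copied from the Common Values table; the remaining 6 labels are not listed there.
counts = {
    "Mestizo/a": 669372,
    "MESTIZO": 451954,
    "MONTUBIO": 36777,
    "Indígena": 35778,
    "Montubio/a": 34942,
    "INDÍGENA": 30274,
    "Blanco/a": 17294,
    "Afroecuatoriano/a o Afrodescendiente": 16340,
    "AFRODESCENDIENTE": 15557,
    "BLANCO": 11247,
}


def normalize(label: str) -> str:
    """Upper-case the label and drop the gendered '/a' suffix."""
    text = label.upper().replace("/A", "")
    # Assumption: both Afro-descendant spellings denote the same category.
    if text.startswith("AFRO"):
        return "AFRODESCENDIENTE"
    return text


merged = Counter()
for label, count in counts.items():
    merged[normalize(label)] += count

print(merged.most_common())
```

After merging, the ten listed labels collapse to five categories, which would also lower the reported distinct count for this variable.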

Length

Histogram of lengths of the category
ValueCountFrequency (%)
mestizo/a 669372
48.5%
mestizo 451954
32.7%
indígena 66052
 
4.8%
montubio 36777
 
2.7%
montubio/a 34942
 
2.5%
afrodescendiente 31897
 
2.3%
blanco/a 17294
 
1.3%
afroecuatoriano/a 16340
 
1.2%
o 16340
 
1.2%
blanco 11247
 
0.8%
Other values (6) 28871
 
2.1%

Most occurring characters

ValueCountFrequency (%)
M 1208180
 
10.4%
o 853245
 
7.4%
a 846002
 
7.3%
e 792588
 
6.8%
/ 752943
 
6.5%
t 746251
 
6.4%
i 736994
 
6.4%
s 685712
 
5.9%
z 669372
 
5.8%
I 570340
 
4.9%
Other values (27) 3727035
32.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5906222
51.0%
Uppercase Letter 4896817
42.3%
Other Punctuation 752943
 
6.5%
Space Separator 32680
 
0.3%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M 1208180
24.7%
I 570340
11.6%
O 570070
11.6%
E 548572
11.2%
T 514048
10.5%
S 467511
 
9.5%
Z 451954
 
9.2%
N 149540
 
3.1%
A 97586
 
2.0%
B 65318
 
1.3%
Other values (8) 253698
 
5.2%
Lowercase Letter
ValueCountFrequency (%)
o 853245
14.4%
a 846002
14.3%
e 792588
13.4%
t 746251
12.6%
i 736994
12.5%
s 685712
11.6%
z 669372
11.3%
n 172812
 
2.9%
d 68458
 
1.2%
u 58589
 
1.0%
Other values (7) 276199
 
4.7%
Other Punctuation
ValueCountFrequency (%)
/ 752943
100.0%
Space Separator
ValueCountFrequency (%)
32680
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10803039
93.2%
Common 785623
 
6.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
M 1208180
 
11.2%
o 853245
 
7.9%
a 846002
 
7.8%
e 792588
 
7.3%
t 746251
 
6.9%
i 736994
 
6.8%
s 685712
 
6.3%
z 669372
 
6.2%
I 570340
 
5.3%
O 570070
 
5.3%
Other values (25) 3124285
28.9%
Common
ValueCountFrequency (%)
/ 752943
95.8%
32680
 
4.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11522610
99.4%
None 66052
 
0.6%

Most frequent character per block

ASCII
ValueCountFrequency (%)
M 1208180
 
10.5%
o 853245
 
7.4%
a 846002
 
7.3%
e 792588
 
6.9%
/ 752943
 
6.5%
t 746251
 
6.5%
i 736994
 
6.4%
s 685712
 
6.0%
z 669372
 
5.8%
I 570340
 
4.9%
Other values (25) 3660983
31.8%
None
ValueCountFrequency (%)
í 35778
54.2%
Í 30274
45.8%

ENC_ETERMINADA
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing114116
Missing (%)8.2%
Memory size10.6 MiB

ENC_FECHA_UPLOAD_ENCUESTA
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing341991
Missing (%)24.5%
Memory size10.6 MiB

ENC_FECHA_IMPRIME
Categorical

HIGH CARDINALITY  IMBALANCE  MISSING 

Distinct181595
Distinct (%)22.9%
Missing600996
Missing (%)43.1%
Memory size10.6 MiB
#NULL!
312934 
06/12/2019
37378 
10/12/2019
 
32458
05/12/2019
 
31400
11/12/2019
 
29842
Other values (181590)
348502 

Length

Max length18
Median length17
Mean length10.37361
Min length6

Characters and Unicode

Total characters8221231
Distinct characters18
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique155455
Unique (%)19.6%

Sample

1st row7/5/2019 19:58:59
2nd row2/5/2019 13:56:26
3rd row4/5/2019 8:17:16
4th row1/5/2019 0:23:42
5th row1/5/2019 9:01:30

Common Values

ValueCountFrequency (%)
#NULL! 312934
22.5%
06/12/2019 37378
 
2.7%
10/12/2019 32458
 
2.3%
05/12/2019 31400
 
2.3%
11/12/2019 29842
 
2.1%
09/12/2019 29616
 
2.1%
12/12/2019 24631
 
1.8%
07/12/2019 22554
 
1.6%
08/12/2019 17218
 
1.2%
04/12/2019 13685
 
1.0%
Other values (181585) 240798
17.3%
(Missing) 600996
43.1%
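
ENC_FECHA_IMPRIME is typed as categorical because it mixes a "#NULL!" sentinel, date-only strings such as "06/12/2019", and date-time strings such as "7/5/2019 19:58:59". The sketch below shows one way such values could be parsed. The function name is hypothetical, and the day-first order is an assumption inferred from values like "30/4/2019" that appear in this section.

```python
from datetime import datetime
from typing import Optional

# Formats observed in this section; day-first order is an assumption.
FORMATS = ("%d/%m/%Y %H:%M:%S", "%d/%m/%Y")


def parse_print_date(raw: Optional[str]) -> Optional[datetime]:
    """Return a datetime, or None for missing values and the '#NULL!' sentinel."""
    if raw is None or raw.strip() in {"", "#NULL!"}:
        return None
    text = raw.strip()
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    return None  # unrecognised layout; left for manual review


for value in ["7/5/2019 19:58:59", "06/12/2019", "#NULL!"]:
    print(repr(value), "->", parse_print_date(value))
```

Counting "#NULL!" as missing would raise the missing total for this column from 600996 to 600996 + 312934 = 913930 rows.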

Length

Histogram of lengths of the category
ValueCountFrequency (%)
null 312934
31.1%
06/12/2019 37378
 
3.7%
10/12/2019 32458
 
3.2%
05/12/2019 31400
 
3.1%
11/12/2019 29842
 
3.0%
09/12/2019 29616
 
2.9%
30/4/2019 27040
 
2.7%
12/12/2019 24631
 
2.5%
6/5/2019 23388
 
2.3%
07/12/2019 22554
 
2.2%
Other values (70616) 433861
43.2%

Most occurring characters

ValueCountFrequency (%)
1 1156134
14.1%
2 1065994
13.0%
/ 959160
11.7%
0 853816
10.4%
L 625868
 
7.6%
9 598498
 
7.3%
: 425176
 
5.2%
! 312934
 
3.8%
N 312934
 
3.8%
U 312934
 
3.8%
Other values (8) 1597783
19.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4746703
57.7%
Other Punctuation 2010204
24.5%
Uppercase Letter 1251736
 
15.2%
Space Separator 212588
 
2.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 1156134
24.4%
2 1065994
22.5%
0 853816
18.0%
9 598498
12.6%
5 287969
 
6.1%
4 253512
 
5.3%
3 190761
 
4.0%
6 133597
 
2.8%
7 112789
 
2.4%
8 93633
 
2.0%
Other Punctuation
ValueCountFrequency (%)
/ 959160
47.7%
: 425176
21.2%
! 312934
 
15.6%
# 312934
 
15.6%
Uppercase Letter
ValueCountFrequency (%)
L 625868
50.0%
N 312934
25.0%
U 312934
25.0%
Space Separator
ValueCountFrequency (%)
212588
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 6969495
84.8%
Latin 1251736
 
15.2%

Most frequent character per script

Common
ValueCountFrequency (%)
1 1156134
16.6%
2 1065994
15.3%
/ 959160
13.8%
0 853816
12.3%
9 598498
8.6%
: 425176
 
6.1%
! 312934
 
4.5%
# 312934
 
4.5%
5 287969
 
4.1%
4 253512
 
3.6%
Other values (5) 743368
10.7%
Latin
ValueCountFrequency (%)
L 625868
50.0%
N 312934
25.0%
U 312934
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8221231
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 1156134
14.1%
2 1065994
13.0%
/ 959160
11.7%
0 853816
10.4%
L 625868
 
7.6%
9 598498
 
7.3%
: 425176
 
5.2%
! 312934
 
3.8%
N 312934
 
3.8%
U 312934
 
3.8%
Other values (8) 1597783
19.4%

ENC_FECHA_FINALIZA_INSCRIPCION
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing80957
Missing (%)5.8%
Memory size10.6 MiB

cod_final
Real number (ℝ)

Distinct672657
Distinct (%)48.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2.6082538 × 10^9
Minimum1.0000009 × 10^9
Maximum9.9998209 × 10^9
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size10.6 MiB

Quantile statistics

Minimum1.0000009 × 10^9
5-th percentile1.2477608 × 10^9
Q1 2.0596414 × 10^9
median2.352071 × 10^9
Q3 2.5957819 × 10^9
95-th percentile6.4448299 × 10^9
Maximum9.9998209 × 10^9
Range8.99982 × 10^9
Interquartile range (IQR)5.3614049 × 10^8
Descriptive statistics

Standard deviation1.492286 × 10^9
Coefficient of variation (CV)0.57213989
Kurtosis10.114951
Mean2.6082538 × 10^9
Median Absolute Deviation (MAD)2.5924017 × 10^8
Skewness3.1337488
Sum3.6346277 × 10^15
Variance2.2269176 × 10^18
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
2154521774 25
 
< 0.1%
2146051701 23
 
< 0.1%
2465590901 22
 
< 0.1%
2432720992 21
 
< 0.1%
2174251774 20
 
< 0.1%
2147341756 20
 
< 0.1%
2186391738 20
 
< 0.1%
2154501701 20
 
< 0.1%
2165561738 20
 
< 0.1%
2149491792 20
 
< 0.1%
Other values (672647) 1393299
> 99.9%
ValueCountFrequency (%)
1000000901 1
 
< 0.1%
1000001310 1
 
< 0.1%
1000011238 1
 
< 0.1%
1000011729 1
 
< 0.1%
1000011965 1
 
< 0.1%
1000030892 3
< 0.1%
1000032110 2
< 0.1%
1000041365 1
 
< 0.1%
1000050374 1
 
< 0.1%
1000051774 2
< 0.1%
ValueCountFrequency (%)
9999820865 1
 
< 0.1%
9999700965 2
< 0.1%
9999671765 1
 
< 0.1%
9999661356 1
 
< 0.1%
9999630929 1
 
< 0.1%
9999531301 1
 
< 0.1%
9999501765 3
< 0.1%
9999471301 1
 
< 0.1%
9999440129 2
< 0.1%
9999411756 1
 
< 0.1%

archivo
Categorical

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size10.6 MiB
insritos_per21.csv
312934 
insritos_per19.csv
309321 
insritos_per22.csv
290025 
insritos_per18.csv
267251 
insritos_per20.csv
213979 

Length

Max length18
Median length18
Mean length18
Min length18

Characters and Unicode

Total characters25083180
Distinct characters17
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowinsritos_per18.csv
2nd rowinsritos_per18.csv
3rd rowinsritos_per18.csv
4th rowinsritos_per18.csv
5th rowinsritos_per18.csv

Common Values

ValueCountFrequency (%)
insritos_per21.csv 312934
22.5%
insritos_per19.csv 309321
22.2%
insritos_per22.csv 290025
20.8%
insritos_per18.csv 267251
19.2%
insritos_per20.csv 213979
15.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
insritos_per21.csv 312934
22.5%
insritos_per19.csv 309321
22.2%
insritos_per22.csv 290025
20.8%
insritos_per18.csv 267251
19.2%
insritos_per20.csv 213979
15.4%

Most occurring characters

ValueCountFrequency (%)
s 4180530
16.7%
i 2787020
11.1%
r 2787020
11.1%
n 1393510
 
5.6%
v 1393510
 
5.6%
c 1393510
 
5.6%
. 1393510
 
5.6%
e 1393510
 
5.6%
p 1393510
 
5.6%
_ 1393510
 
5.6%
Other values (7) 5574040
22.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 19509140
77.8%
Decimal Number 2787020
 
11.1%
Other Punctuation 1393510
 
5.6%
Connector Punctuation 1393510
 
5.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 4180530
21.4%
i 2787020
14.3%
r 2787020
14.3%
n 1393510
 
7.1%
v 1393510
 
7.1%
c 1393510
 
7.1%
e 1393510
 
7.1%
p 1393510
 
7.1%
o 1393510
 
7.1%
t 1393510
 
7.1%
Decimal Number
ValueCountFrequency (%)
2 1106963
39.7%
1 889506
31.9%
9 309321
 
11.1%
8 267251
 
9.6%
0 213979
 
7.7%
Other Punctuation
ValueCountFrequency (%)
. 1393510
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1393510
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 19509140
77.8%
Common 5574040
 
22.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 4180530
21.4%
i 2787020
14.3%
r 2787020
14.3%
n 1393510
 
7.1%
v 1393510
 
7.1%
c 1393510
 
7.1%
e 1393510
 
7.1%
p 1393510
 
7.1%
o 1393510
 
7.1%
t 1393510
 
7.1%
Common
ValueCountFrequency (%)
. 1393510
25.0%
_ 1393510
25.0%
2 1106963
19.9%
1 889506
16.0%
9 309321
 
5.5%
8 267251
 
4.8%
0 213979
 
3.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 25083180
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 4180530
16.7%
i 2787020
11.1%
r 2787020
11.1%
n 1393510
 
5.6%
v 1393510
 
5.6%
c 1393510
 
5.6%
. 1393510
 
5.6%
e 1393510
 
5.6%
p 1393510
 
5.6%
_ 1393510
 
5.6%
Other values (7) 5574040
22.2%

PER_ID
Categorical

HIGH CORRELATION  MISSING 

Distinct4
Distinct (%)< 0.1%
Missing267251
Missing (%)19.2%
Memory size10.6 MiB
21.0
312934 
19.0
309321 
22.0
290025 
20.0
213979 

Length

Max length4
Median length4
Mean length4
Min length4

Characters and Unicode

Total characters4505036
Distinct characters5
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row19.0
2nd row19.0
3rd row19.0
4th row19.0
5th row19.0

Common Values

ValueCountFrequency (%)
21.0 312934
22.5%
19.0 309321
22.2%
22.0 290025
20.8%
20.0 213979
15.4%
(Missing) 267251
19.2%
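
The PER_ID counts for 19.0 to 22.0 match the row counts of the corresponding `archivo` files, and the 267251 missing values equal the row count of insritos_per18.csv. Under the assumption that the number in the file name is the period identifier (the report does not state this), a missing PER_ID could be filled from `archivo` as sketched below. Both helper names are hypothetical.

```python
import re
from typing import Optional


def period_from_filename(name: str) -> Optional[int]:
    """Extract the trailing number from names like 'insritos_per18.csv'."""
    match = re.search(r"per(\d+)\.csv$", name)
    return int(match.group(1)) if match else None


def clean_period(raw: Optional[str], filename: str) -> Optional[int]:
    """Parse values such as '19.0'; fall back to the file name when missing."""
    if raw is not None and raw.strip():
        return int(float(raw))
    # Assumption: the file-name suffix is the same period identifier.
    return period_from_filename(filename)


print(clean_period("19.0", "insritos_per19.csv"))
print(clean_period(None, "insritos_per18.csv"))
```

Storing the result as an integer also removes the ".0" suffix that makes the column appear as four-character text in this report.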

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
21.0 312934
27.8%
19.0 309321
27.5%
22.0 290025
25.8%
20.0 213979
19.0%

Most occurring characters

ValueCountFrequency (%)
0 1340238
29.7%
. 1126259
25.0%
2 1106963
24.6%
1 622255
13.8%
9 309321
 
6.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3378777
75.0%
Other Punctuation 1126259
 
25.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 1340238
39.7%
2 1106963
32.8%
1 622255
18.4%
9 309321
 
9.2%
Other Punctuation
ValueCountFrequency (%)
. 1126259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4505036
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 1340238
29.7%
. 1126259
25.0%
2 1106963
24.6%
1 622255
13.8%
9 309321
 
6.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4505036
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 1340238
29.7%
. 1126259
25.0%
2 1106963
24.6%
1 622255
13.8%
9 309321
 
6.9%

USU_NACIONALIDAD
Categorical

HIGH CARDINALITY  IMBALANCE  MISSING 

Distinct66
Distinct (%)< 0.1%
Missing267341
Missing (%)19.2%
Memory size10.6 MiB
ECUATORIANA
1100505 
ECUADOR
 
21674
COLOMBIANA
 
1478
VENEZOLANA
 
1338
CUBANA
 
326
Other values (61)
 
848

Length

Max length24
Median length11
Mean length10.919602
Min length2

Characters and Unicode

Total characters12297317
Distinct characters31
Distinct categories4
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique15
Unique (%)< 0.1%

Sample

1st rowECUATORIANA
2nd rowECUATORIANA
3rd rowECUATORIANA
4th rowECUATORIANA
5th rowECUADOR

Common Values

ValueCountFrequency (%)
ECUATORIANA 1100505
79.0%
ECUADOR 21674
 
1.6%
COLOMBIANA 1478
 
0.1%
VENEZOLANA 1338
 
0.1%
CUBANA 326
 
< 0.1%
PERUANA 183
 
< 0.1%
ECUATORIANA / ESPAÑOLA 174
 
< 0.1%
COLOMBIA 71
 
< 0.1%
VENEZUELA 65
 
< 0.1%
CHILENA 29
 
< 0.1%
Other values (56) 326
 
< 0.1%
(Missing) 267341
 
19.2%
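
Several USU_NACIONALIDAD entries give a country name ("ECUADOR", "COLOMBIA", "VENEZUELA") where most rows give a nationality. The sketch below folds those country names into the matching nationality labels and recounts. The mapping is an illustrative assumption and covers only the values listed in the table above; dual nationalities are left unchanged.

```python
from collections import Counter

# Counts copied from the Common Values table; 56 rarer labels are not listed there.
counts = {
    "ECUATORIANA": 1100505,
    "ECUADOR": 21674,
    "COLOMBIANA": 1478,
    "VENEZOLANA": 1338,
    "CUBANA": 326,
    "PERUANA": 183,
    "ECUATORIANA / ESPAÑOLA": 174,
    "COLOMBIA": 71,
    "VENEZUELA": 65,
    "CHILENA": 29,
}

# Assumption: a country name stands for the corresponding nationality.
COUNTRY_TO_NATIONALITY = {
    "ECUADOR": "ECUATORIANA",
    "COLOMBIA": "COLOMBIANA",
    "VENEZUELA": "VENEZOLANA",
}

merged = Counter()
for label, count in counts.items():
    merged[COUNTRY_TO_NATIONALITY.get(label, label)] += count

print(merged.most_common(3))
```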

Length

Histogram of lengths of the category
ValueCountFrequency (%)
ecuatoriana 1100726
97.7%
ecuador 21674
 
1.9%
colombiana 1483
 
0.1%
venezolana 1347
 
0.1%
cubana 326
 
< 0.1%
221
 
< 0.1%
española 195
 
< 0.1%
peruana 183
 
< 0.1%
colombia 71
 
< 0.1%
venezuela 65
 
< 0.1%
Other values (49) 322
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
A 3331624
27.1%
O 1127147
 
9.2%
E 1125900
 
9.2%
C 1124397
 
9.1%
U 1123064
 
9.1%
R 1122733
 
9.1%
N 1105766
 
9.0%
I 1102621
 
9.0%
T 1100830
 
9.0%
D 21750
 
0.2%
Other values (21) 11485
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 12296574
> 99.9%
Space Separator 444
 
< 0.1%
Other Punctuation 249
 
< 0.1%
Decimal Number 50
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 3331624
27.1%
O 1127147
 
9.2%
E 1125900
 
9.2%
C 1124397
 
9.1%
U 1123064
 
9.1%
R 1122733
 
9.1%
N 1105766
 
9.0%
I 1102621
 
9.0%
T 1100830
 
9.0%
D 21750
 
0.2%
Other values (15) 10742
 
0.1%
Decimal Number
ValueCountFrequency (%)
1 17
34.0%
0 17
34.0%
7 16
32.0%
Other Punctuation
ValueCountFrequency (%)
/ 235
94.4%
. 14
 
5.6%
Space Separator
ValueCountFrequency (%)
444
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 12296574
> 99.9%
Common 743
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 3331624
27.1%
O 1127147
 
9.2%
E 1125900
 
9.2%
C 1124397
 
9.1%
U 1123064
 
9.1%
R 1122733
 
9.1%
N 1105766
 
9.0%
I 1102621
 
9.0%
T 1100830
 
9.0%
D 21750
 
0.2%
Other values (15) 10742
 
0.1%
Common
ValueCountFrequency (%)
444
59.8%
/ 235
31.6%
1 17
 
2.3%
0 17
 
2.3%
7 16
 
2.2%
. 14
 
1.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12297082
> 99.9%
None 235
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 3331624
27.1%
O 1127147
 
9.2%
E 1125900
 
9.2%
C 1124397
 
9.1%
U 1123064
 
9.1%
R 1122733
 
9.1%
N 1105766
 
9.0%
I 1102621
 
9.0%
T 1100830
 
9.0%
D 21750
 
0.2%
Other values (20) 11250
 
0.1%
None
ValueCountFrequency (%)
Ñ 235
100.0%

USU_NACIONALIDAD_EXTRANJERA
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing348841
Missing (%)25.0%
Memory size10.6 MiB

IAS_OTRA_DISCAPACIDAD
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing318266
Missing (%)22.8%
Memory size10.6 MiB

INS_TIPO
Categorical

CONSTANT  MISSING 

Distinct1
Distinct (%)< 0.1%
Missing267251
Missing (%)19.2%
Memory size10.6 MiB
I
1126259 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters1126259
Distinct characters1
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowI
2nd rowI
3rd rowI
4th rowI
5th rowI

Common Values

ValueCountFrequency (%)
I 1126259
80.8%
(Missing) 267251
 
19.2%
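
INS_TIPO is flagged CONSTANT because "I" is its only non-missing value. A minimal sketch of that check follows, assuming a column counts as constant when exactly one distinct non-missing value is observed. The toy table and the `constant_columns` helper are hypothetical and not taken from the dataset.

```python
from typing import Dict, List, Optional


def constant_columns(table: Dict[str, List[Optional[str]]]) -> List[str]:
    """Return the columns that hold a single distinct non-missing value."""
    constant = []
    for name, values in table.items():
        observed = {v for v in values if v is not None}
        if len(observed) == 1:
            constant.append(name)
    return constant


# Toy rows shaped like the columns in this report (None marks a missing cell).
toy = {
    "INS_TIPO": ["I", "I", None, "I"],
    "CARGA_ENCUESTA": ["SI", "EFA_CARGADA", None, "NO"],
}
print(constant_columns(toy))
```

A constant column carries no information for modelling, although its missing-value pattern (here 19.2% of rows) may still be worth keeping as a separate indicator.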

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
i 1126259
100.0%

Most occurring characters

ValueCountFrequency (%)
I 1126259
100.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1126259
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
I 1126259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1126259
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
I 1126259
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1126259
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
I 1126259
100.0%

CARGA_ENCUESTA
Categorical

HIGH CORRELATION  MISSING 

Distinct3
Distinct (%)< 0.1%
Missing283543
Missing (%)20.3%
Memory size10.6 MiB
SI
739494 
EFA_CARGADA
293029 
NO
77444 

Length

Max length11
Median length2
Mean length4.3759814
Min length2

Characters and Unicode

Total characters4857195
Distinct characters12
Distinct categories2
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowEFA_CARGADA
2nd rowEFA_CARGADA
3rd rowEFA_CARGADA
4th rowEFA_CARGADA
5th rowEFA_CARGADA

Common Values

ValueCountFrequency (%)
SI 739494
53.1%
EFA_CARGADA 293029
 
21.0%
NO 77444
 
5.6%
(Missing) 283543
 
20.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
si 739494
66.6%
efa_cargada 293029
 
26.4%
no 77444
 
7.0%
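
The two CARGA_ENCUESTA tables report different percentages for the same counts because they use different denominators: the Common Values table divides by all rows, while the table above divides by non-missing rows only. The sketch below reproduces both sets of figures from the printed counts; the names are illustrative.

```python
# Counts copied from the CARGA_ENCUESTA tables in this report.
counts = {"SI": 739494, "EFA_CARGADA": 293029, "NO": 77444}
missing = 283543

non_missing = sum(counts.values())
all_rows = non_missing + missing


def shares(denominator: int) -> dict:
    """Percentage of each value relative to the given denominator."""
    return {k: round(100 * v / denominator, 1) for k, v in counts.items()}


pct_of_all_rows = shares(all_rows)        # matches the Common Values table
pct_of_non_missing = shares(non_missing)  # matches the table above
print(pct_of_all_rows)
print(pct_of_non_missing)
```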

Most occurring characters

ValueCountFrequency (%)
A 1172116
24.1%
S 739494
15.2%
I 739494
15.2%
E 293029
 
6.0%
F 293029
 
6.0%
_ 293029
 
6.0%
C 293029
 
6.0%
R 293029
 
6.0%
G 293029
 
6.0%
D 293029
 
6.0%
Other values (2) 154888
 
3.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 4564166
94.0%
Connector Punctuation 293029
 
6.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 1172116
25.7%
S 739494
16.2%
I 739494
16.2%
E 293029
 
6.4%
F 293029
 
6.4%
C 293029
 
6.4%
R 293029
 
6.4%
G 293029
 
6.4%
D 293029
 
6.4%
N 77444
 
1.7%
Connector Punctuation
ValueCountFrequency (%)
_ 293029
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4564166
94.0%
Common 293029
 
6.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 1172116
25.7%
S 739494
16.2%
I 739494
16.2%
E 293029
 
6.4%
F 293029
 
6.4%
C 293029
 
6.4%
R 293029
 
6.4%
G 293029
 
6.4%
D 293029
 
6.4%
N 77444
 
1.7%
Common
ValueCountFrequency (%)
_ 293029
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4857195
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 1172116
24.1%
S 739494
15.2%
I 739494
15.2%
E 293029
 
6.0%
F 293029
 
6.0%
_ 293029
 
6.0%
C 293029
 
6.0%
R 293029
 
6.0%
G 293029
 
6.0%
D 293029
 
6.0%
Other values (2) 154888
 
3.2%

FINALIZA_INSCRIPCION
Categorical

HIGH CORRELATION  IMBALANCE  MISSING 

Distinct2
Distinct (%)< 0.1%
Missing267251
Missing (%)19.2%
Memory size10.6 MiB
EFA Y REGISTRO COMPLETO
1032523 
EFA Y REGISTRO INCOMPLETO
 
93736

Length

Max length25
Median length23
Mean length23.166455
Min length23

Characters and Unicode

Total characters26091429
Distinct characters16
Distinct categories2
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowEFA Y REGISTRO COMPLETO
2nd rowEFA Y REGISTRO COMPLETO
3rd rowEFA Y REGISTRO COMPLETO
4th rowEFA Y REGISTRO COMPLETO
5th rowEFA Y REGISTRO COMPLETO

Common Values

ValueCountFrequency (%)
EFA Y REGISTRO COMPLETO 1032523
74.1%
EFA Y REGISTRO INCOMPLETO 93736
 
6.7%
(Missing) 267251
 
19.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
efa 1126259
25.0%
y 1126259
25.0%
registro 1126259
25.0%
completo 1032523
22.9%
incompleto 93736
 
2.1%

Most occurring characters

ValueCountFrequency (%)
E 3378777
12.9%
3378777
12.9%
O 3378777
12.9%
R 2252518
 
8.6%
T 2252518
 
8.6%
I 1219995
 
4.7%
F 1126259
 
4.3%
A 1126259
 
4.3%
Y 1126259
 
4.3%
G 1126259
 
4.3%
Other values (6) 5725031
21.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 22712652
87.1%
Space Separator 3378777
 
12.9%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 3378777
14.9%
O 3378777
14.9%
R 2252518
9.9%
T 2252518
9.9%
I 1219995
 
5.4%
F 1126259
 
5.0%
A 1126259
 
5.0%
Y 1126259
 
5.0%
G 1126259
 
5.0%
S 1126259
 
5.0%
Other values (5) 4598772
20.2%
Space Separator
ValueCountFrequency (%)
3378777
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 22712652
87.1%
Common 3378777
 
12.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 3378777
14.9%
O 3378777
14.9%
R 2252518
9.9%
T 2252518
9.9%
I 1219995
 
5.4%
F 1126259
 
5.0%
A 1126259
 
5.0%
Y 1126259
 
5.0%
G 1126259
 
5.0%
S 1126259
 
5.0%
Other values (5) 4598772
20.2%
Common
ValueCountFrequency (%)
3378777
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 26091429
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 3378777
12.9%
3378777
12.9%
O 3378777
12.9%
R 2252518
 
8.6%
T 2252518
 
8.6%
I 1219995
 
4.7%
F 1126259
 
4.3%
A 1126259
 
4.3%
Y 1126259
 
4.3%
G 1126259
 
4.3%
Other values (6) 5725031
21.9%

OTRA_DISCAPACIDAD
Categorical

HIGH CORRELATION  IMBALANCE  MISSING 

Distinct2
Distinct (%)< 0.1%
Missing602388
Missing (%)43.2%
Memory size10.6 MiB
NO
790870 
SI
 
252

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters1582244
Distinct characters4
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowNO
2nd rowNO
3rd rowNO
4th rowNO
5th rowNO

Common Values

ValueCountFrequency (%)
NO 790870
56.8%
SI 252
 
< 0.1%
(Missing) 602388
43.2%
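
OTRA_DISCAPACIDAD, like the other SI/NO variables in this report, holds two text values plus missing cells. One possible encoding maps them to booleans and keeps missing cells distinct, as sketched below. The `to_bool` helper is hypothetical, and returning None for any unexpected value is an assumption.

```python
from typing import Optional

YES_NO = {"SI": True, "NO": False}


def to_bool(raw: Optional[str]) -> Optional[bool]:
    """Map 'SI'/'NO' to True/False; missing or unexpected values become None."""
    if raw is None:
        return None
    return YES_NO.get(raw.strip().upper())


sample = ["NO", "SI", None, " no "]
print([to_bool(v) for v in sample])
```

Keeping None separate from False matters here, since 43.2% of rows are missing and should not be read as "NO".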

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
no 790870
> 99.9%
si 252
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
N 790870
50.0%
O 790870
50.0%
S 252
 
< 0.1%
I 252
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1582244
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 790870
50.0%
O 790870
50.0%
S 252
 
< 0.1%
I 252
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 1582244
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 790870
50.0%
O 790870
50.0%
S 252
 
< 0.1%
I 252
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1582244
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 790870
50.0%
O 790870
50.0%
S 252
 
< 0.1%
I 252
 
< 0.1%

TITULO_HOMOLOGADO
Categorical

HIGH CORRELATION  IMBALANCE  MISSING 

Distinct2
Distinct (%)< 0.1%
Missing605682
Missing (%)43.5%
Memory size10.6 MiB
NO
786315 
SI
 
1513

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters1575656
Distinct characters4
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowNO
2nd rowNO
3rd rowNO
4th rowNO
5th rowNO

Common Values

ValueCountFrequency (%)
NO 786315
56.4%
SI 1513
 
0.1%
(Missing) 605682
43.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
no 786315
99.8%
si 1513
 
0.2%

Most occurring characters

ValueCountFrequency (%)
N 786315
49.9%
O 786315
49.9%
S 1513
 
0.1%
I 1513
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1575656
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 786315
49.9%
O 786315
49.9%
S 1513
 
0.1%
I 1513
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 1575656
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 786315
49.9%
O 786315
49.9%
S 1513
 
0.1%
I 1513
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1575656
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 786315
49.9%
O 786315
49.9%
S 1513
 
0.1%
I 1513
 
0.1%

INTERNET_DOMICILIO
Categorical

HIGH CORRELATION  IMBALANCE  MISSING 

Distinct2
Distinct (%)< 0.1%
Missing605682
Missing (%)43.5%
Memory size10.6 MiB
SI
719317 
NO
 
68511

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters1575656
Distinct characters4
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowSI
2nd rowSI
3rd rowSI
4th rowSI
5th rowSI

Common Values

ValueCountFrequency (%)
SI 719317
51.6%
NO 68511
 
4.9%
(Missing) 605682
43.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
si | 719317 | 91.3%
no | 68511 | 8.7%

Most occurring characters

Value | Count | Frequency (%)
S | 719317 | 45.7%
I | 719317 | 45.7%
N | 68511 | 4.3%
O | 68511 | 4.3%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 1575656 | 100.0%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
S | 719317 | 45.7%
I | 719317 | 45.7%
N | 68511 | 4.3%
O | 68511 | 4.3%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1575656 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
S | 719317 | 45.7%
I | 719317 | 45.7%
N | 68511 | 4.3%
O | 68511 | 4.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1575656 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
S | 719317 | 45.7%
I | 719317 | 45.7%
N | 68511 | 4.3%
O | 68511 | 4.3%

COMPUTADORA_DOMICILIO
Categorical

HIGH CORRELATION  MISSING 

Distinct | 2
Distinct (%) | < 0.1%
Missing | 605682
Missing (%) | 43.5%
Memory size | 10.6 MiB
SI | 645393
NO | 142435

Length

Max length | 2
Median length | 2
Mean length | 2
Min length | 2

Characters and Unicode

Total characters | 1575656
Distinct characters | 4
Distinct categories | 1
Distinct scripts | 1
Distinct blocks | 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique | 0
Unique (%) | 0.0%

Sample

1st row | SI
2nd row | SI
3rd row | SI
4th row | NO
5th row | SI

Common Values

Value | Count | Frequency (%)
SI | 645393 | 46.3%
NO | 142435 | 10.2%
(Missing) | 605682 | 43.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
si | 645393 | 81.9%
no | 142435 | 18.1%

Most occurring characters

Value | Count | Frequency (%)
S | 645393 | 41.0%
I | 645393 | 41.0%
N | 142435 | 9.0%
O | 142435 | 9.0%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 1575656 | 100.0%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
S | 645393 | 41.0%
I | 645393 | 41.0%
N | 142435 | 9.0%
O | 142435 | 9.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1575656 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
S | 645393 | 41.0%
I | 645393 | 41.0%
N | 142435 | 9.0%
O | 142435 | 9.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1575656 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
S | 645393 | 41.0%
I | 645393 | 41.0%
N | 142435 | 9.0%
O | 142435 | 9.0%

Interactions

[Figure: 49 interaction plots (images not preserved)]

Correlations

Variable | Unnamed: 0 | INS_ID | INS_PASO | COD_PROVINCIA_RESIDE | COD_CANTON_RESIDE | COD_PARROQUIA_RESIDE | cod_final | USU_ESTADO_CIVIL | INS_SEXO | INS_ESTADO | INS_TIPO_INSCRIPCION | PROVINCIA_RESIDE | INS_AUTOIDENTIFICACION | archivo | PER_ID | USU_NACIONALIDAD | CARGA_ENCUESTA | FINALIZA_INSCRIPCION | OTRA_DISCAPACIDAD | TITULO_HOMOLOGADO | INTERNET_DOMICILIO | COMPUTADORA_DOMICILIO
Unnamed: 0 | 1.000 | 0.045 | 0.006 | 0.768 | 0.761 | 0.757 | 0.025 | 0.045 | 0.022 | 0.033 | 0.000 | 0.475 | 0.082 | 0.147 | 0.166 | 0.026 | 0.076 | 0.035 | 0.000 | 0.014 | 0.035 | 0.063
INS_ID | 0.045 | 1.000 | -0.030 | -0.016 | -0.016 | -0.016 | 0.346 | 0.035 | 0.075 | 0.075 | 1.000 | 0.149 | 0.449 | 1.000 | 1.000 | 0.005 | 0.128 | 0.128 | 0.002 | 0.005 | 0.072 | 0.090
INS_PASO | 0.006 | -0.030 | 1.000 | 0.006 | 0.006 | 0.006 | 0.038 | 0.012 | 0.228 | 1.000 | 0.000 | 0.012 | 0.018 | 0.067 | 0.073 | 0.035 | 0.424 | 0.615 | 0.000 | 0.000 | 0.000 | 0.000
COD_PROVINCIA_RESIDE | 0.768 | -0.016 | 0.006 | 1.000 | 0.992 | 0.986 | 0.012 | 0.008 | 0.006 | 0.007 | 0.000 | 1.000 | 0.085 | 0.081 | 0.076 | 0.011 | 0.072 | 0.018 | 0.000 | 0.004 | 0.022 | 0.047
COD_CANTON_RESIDE | 0.761 | -0.016 | 0.006 | 0.992 | 1.000 | 0.994 | 0.014 | 0.008 | 0.006 | 0.007 | 0.000 | 1.000 | 0.085 | 0.081 | 0.076 | 0.011 | 0.072 | 0.018 | 0.000 | 0.004 | 0.022 | 0.047
COD_PARROQUIA_RESIDE | 0.757 | -0.016 | 0.006 | 0.986 | 0.994 | 1.000 | 0.014 | 0.008 | 0.006 | 0.007 | 0.000 | 1.000 | 0.085 | 0.081 | 0.076 | 0.011 | 0.072 | 0.018 | 0.000 | 0.004 | 0.022 | 0.047
cod_final | 0.025 | 0.346 | 0.038 | 0.012 | 0.014 | 0.014 | 1.000 | 0.051 | 0.034 | 0.073 | 0.000 | 0.044 | 0.028 | 0.045 | 0.049 | 0.016 | 0.074 | 0.094 | 0.002 | 0.007 | 0.024 | 0.047
USU_ESTADO_CIVIL | 0.045 | 0.035 | 0.012 | 0.008 | 0.008 | 0.008 | 0.051 | 1.000 | 0.043 | 0.015 | 0.000 | 0.024 | 0.040 | 0.054 | 0.063 | 0.016 | 0.060 | 0.035 | 0.004 | 0.021 | 0.038 | 0.049
INS_SEXO | 0.022 | 0.075 | 0.228 | 0.006 | 0.006 | 0.006 | 0.034 | 0.043 | 1.000 | 0.268 | 0.000 | 0.024 | 0.034 | 0.078 | 0.037 | 0.016 | 0.037 | 0.014 | 0.003 | 0.004 | 0.015 | 0.035
INS_ESTADO | 0.033 | 0.075 | 1.000 | 0.007 | 0.007 | 0.007 | 0.073 | 0.015 | 0.268 | 1.000 | 0.000 | 0.023 | 0.032 | 0.059 | 0.064 | 0.029 | 0.599 | 0.615 | 0.000 | 0.000 | 0.000 | 0.000
INS_TIPO_INSCRIPCION | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.026 | 0.001 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000
PROVINCIA_RESIDE | 0.475 | 0.149 | 0.012 | 1.000 | 1.000 | 1.000 | 0.044 | 0.024 | 0.024 | 0.023 | 0.000 | 1.000 | 0.153 | 0.187 | 0.191 | 0.012 | 0.171 | 0.039 | 0.003 | 0.013 | 0.136 | 0.157
INS_AUTOIDENTIFICACION | 0.082 | 0.449 | 0.018 | 0.085 | 0.085 | 0.085 | 0.028 | 0.040 | 0.034 | 0.032 | 0.000 | 0.153 | 1.000 | 0.510 | 0.579 | 0.071 | 0.707 | 0.073 | 0.001 | 0.029 | 0.122 | 0.096
archivo | 0.147 | 1.000 | 0.067 | 0.081 | 0.081 | 0.081 | 0.045 | 0.054 | 0.078 | 0.059 | 0.000 | 0.187 | 0.510 | 1.000 | 1.000 | 0.132 | 0.708 | 0.081 | 0.000 | 0.008 | 0.058 | 0.067
PER_ID | 0.166 | 1.000 | 0.073 | 0.076 | 0.076 | 0.076 | 0.049 | 0.063 | 0.037 | 0.064 | 0.000 | 0.191 | 0.579 | 1.000 | 1.000 | 0.132 | 0.708 | 0.081 | 0.000 | 0.008 | 0.058 | 0.067
USU_NACIONALIDAD | 0.026 | 0.005 | 0.035 | 0.011 | 0.011 | 0.011 | 0.016 | 0.016 | 0.016 | 0.029 | 0.026 | 0.012 | 0.071 | 0.132 | 0.132 | 1.000 | 0.155 | 0.035 | 0.000 | 0.297 | 0.005 | 0.006
CARGA_ENCUESTA | 0.076 | 0.128 | 0.424 | 0.072 | 0.072 | 0.072 | 0.074 | 0.060 | 0.037 | 0.599 | 0.001 | 0.171 | 0.707 | 0.708 | 0.708 | 0.155 | 1.000 | 1.000 | 0.001 | 0.005 | 0.027 | 0.087
FINALIZA_INSCRIPCION | 0.035 | 0.128 | 0.615 | 0.018 | 0.018 | 0.018 | 0.094 | 0.035 | 0.014 | 0.615 | 0.000 | 0.039 | 0.073 | 0.081 | 0.081 | 0.035 | 1.000 | 1.000 | 0.001 | 0.005 | 0.027 | 0.087
OTRA_DISCAPACIDAD | 0.000 | 0.002 | 0.000 | 0.000 | 0.000 | 0.000 | 0.002 | 0.004 | 0.003 | 0.000 | 1.000 | 0.003 | 0.001 | 0.000 | 0.000 | 0.000 | 0.001 | 0.001 | 1.000 | 0.000 | 0.001 | 0.002
TITULO_HOMOLOGADO | 0.014 | 0.005 | 0.000 | 0.004 | 0.004 | 0.004 | 0.007 | 0.021 | 0.004 | 0.000 | 1.000 | 0.013 | 0.029 | 0.008 | 0.008 | 0.297 | 0.005 | 0.005 | 0.000 | 1.000 | 0.000 | 0.000
INTERNET_DOMICILIO | 0.035 | 0.072 | 0.000 | 0.022 | 0.022 | 0.022 | 0.024 | 0.038 | 0.015 | 0.000 | 1.000 | 0.136 | 0.122 | 0.058 | 0.058 | 0.005 | 0.027 | 0.027 | 0.001 | 0.000 | 1.000 | 0.503
COMPUTADORA_DOMICILIO | 0.063 | 0.090 | 0.000 | 0.047 | 0.047 | 0.047 | 0.047 | 0.049 | 0.035 | 0.000 | 1.000 | 0.157 | 0.096 | 0.067 | 0.067 | 0.006 | 0.087 | 0.087 | 0.002 | 0.000 | 0.503 | 1.000
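
The report does not say which coefficient each cell of this matrix uses. The few negative entries suggest a signed coefficient for numeric pairs, and a common choice for pairs of categorical variables (such as INTERNET_DOMICILIO and COMPUTADORA_DOMICILIO, 0.503) is Cramér's V. The sketch below computes the plain, uncorrected Cramér's V from a contingency table. It is an assumption that the profiler used this measure, the tool may also apply a bias correction, and the 2x2 counts are hypothetical because the report does not include the cross-tabulation.

```python
import math


def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table (list of rows of counts).

    Assumes every row and every column has a non-zero total.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))


# Hypothetical cross-tabulation (rows: internet SI/NO, columns: computer SI/NO).
hypothetical = [[60, 10], [15, 15]]
print(round(cramers_v(hypothetical), 3))
```

A value of 0 means the two variables are independent in the table and 1 means one variable fully determines the other, which is why pairs such as PROVINCIA_RESIDE and COD_PROVINCIA_RESIDE reach 1.000.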

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
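
A minimal sketch of nullity correlation, assuming it is the Pearson correlation between per-column present/missing indicators, with columns that are always present or always missing left out because their indicator has no variance. The column names come from this report, but the six rows of values are made up for illustration, and the helper names are hypothetical.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def nullity_correlation(columns):
    """Map each pair of columns to the correlation of their present(1)/missing(0) flags."""
    flags = {name: [0 if v is None else 1 for v in vals] for name, vals in columns.items()}
    # Drop columns whose flag never changes; their correlation is undefined.
    varying = {name: f for name, f in flags.items() if 0 < sum(f) < len(f)}
    return {(a, b): pearson(varying[a], varying[b]) for a, b in combinations(varying, 2)}


# Made-up values; None marks a missing cell.
columns = {
    "INTERNET_DOMICILIO": ["SI", None, "SI", None, "NO", "SI"],
    "COMPUTADORA_DOMICILIO": ["SI", None, "NO", None, "NO", "SI"],
    "INS_ID": [1.0, 2.0, None, 4.0, 5.0, 6.0],
    "USU_ESTADO": ["A"] * 6,
}
result = nullity_correlation(columns)
for pair, r in result.items():
    print(pair, round(r, 3))
```

Two columns that are always missing together, as the identical missing count of 605682 for INTERNET_DOMICILIO and COMPUTADORA_DOMICILIO suggests, would show a nullity correlation close to 1.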

Sample

Row | Unnamed: 0 | USU_FECHAREGISTRO | USU_ESTADO | USU_ESTADO_CIVIL | USU_FECHA_NAC | INS_SEXO | INS_ID | INS_FECHA | INS_ESTADO | INS_PASO | INS_FECHA_ACONDICIONES | INS_TIPO_INSCRIPCION | COD_PAIS_RESIDE | PAIS_RES | COD_PROVINCIA_RESIDE | PROVINCIA_RESIDE | COD_CANTON_RESIDE | CANTON_RESIDE | COD_PARROQUIA_RESIDE | PARROQUIA_RESIDE | INS_AUTOIDENTIFICACION | ENC_ETERMINADA | ENC_FECHA_UPLOAD_ENCUESTA | ENC_FECHA_IMPRIME | ENC_FECHA_FINALIZA_INSCRIPCION | cod_final | archivo | PER_ID | USU_NACIONALIDAD | USU_NACIONALIDAD_EXTRANJERA | IAS_OTRA_DISCAPACIDAD | INS_TIPO | CARGA_ENCUESTA | FINALIZA_INSCRIPCION | OTRA_DISCAPACIDAD | TITULO_HOMOLOGADO | INTERNET_DOMICILIO | COMPUTADORA_DOMICILIO
0 | 1 | 4/5/2018 13:37:49 | A | S | 28/9/2000 0:00:00 | MUJER | 7433157.0 | 7/5/2019 19:29:18 | TERMINADO | 5 | 3/5/2019 18:07:25 | 1 | 1.0 | ECUADOR | 5.0 | COTOPAXI | 501.0 | LATACUNGA | 50104.0 | LA MATRIZ | MESTIZO | TERMINADA | 7/5/2019 0:00:00 | 7/5/2019 19:58:59 | 7/5/2019 19:58:55 | 1881330092 | insritos_per18.csv | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
1 | 2 | 29/4/2019 20:05:58 | A | NaN | NaN | SIN DATO | 7266969.0 | 29/4/2019 20:07:20 | INCOMPLETO | 0 | 29/4/2019 20:07:20 | 1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2211600092 | insritos_per18.csv | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
2 | 3 | 1/5/2019 18:29:55 | A | S | 1/7/1984 0:00:00 | HOMBRE | 7369749.0 | 1/5/2019 19:28:25 | TERMINADO | 5 | 1/5/2019 18:33:53 | 1 | 1.0 | ECUADOR | 17.0 | PICHINCHA | 1701.0 | DISTRITO METROPOLITANO DE QUITO | 170130.0 | SAN JUAN | MULATO | TERMINADA | 2/5/2019 0:00:00 | 2/5/2019 13:56:26 | 2/5/2019 13:56:23 | 2238110001 | insritos_per18.csv | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
3 | 4 | 4/5/2019 3:45:26 | A | S | 15/8/2001 0:00:00 | MUJER | 7439944.0 | 4/5/2019 4:40:04 | TERMINADO | 5 | 4/5/2019 3:47:06 | 1 | 1.0 | ECUADOR | 17.0 | PICHINCHA | 1701.0 | DISTRITO METROPOLITANO DE QUITO | 170155.0 | CALDERON (CARAPUNGO) | MESTIZO | NaN | NaN | NaN | NaN | 2264040047 | insritos_per18.csv | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
4 | 5 | 3/5/2019 1:39:48 | A | S | 11/8/2000 0:00:00 | MUJER | 7420125.0 | 3/5/2019 3:21:08 | TERMINADO | 5 | 3/5/2019 1:59:14 | 1 | 1.0 | ECUADOR | 17.0 | PICHINCHA | 1701.0 | DISTRITO METROPOLITANO DE QUITO | 170101.0 | BELISARIO QUEVEDO | OTRO | TERMINADA | 3/5/2019 0:00:00 | 4/5/2019 8:17:16 | 3/5/2019 4:44:45 | 2276920056 | insritos_per18.csv | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
5 | 6 | 5/12/2018 15:55:35 | A | S | 24/7/2000 0:00:00 | MUJER | 7250052.0 | 30/4/2019 0:30:19 | TERMINADO | 5 | 29/4/2019 14:59:40 | 1 | 1.0 | ECUADOR | 3.0 | CAÑAR | 304.0 | LA TRONCAL | 30450.0 | LA TRONCAL | MESTIZO | TERMINADA | 1/5/2019 0:00:00 | 1/5/2019 0:23:42 | 1/5/2019 0:22:45 | 2137980056 | insritos_per18.csv | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
6 | 7 | 5/5/2019 20:53:48 | A | NaN | NaN | SIN DATO | 7468669.0 | 5/5/2019 21:04:28 | INCOMPLETO | 0 | 5/5/2019 21:04:28 | 1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2273730038 | insritos_per18.csv | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
7 | 8 | 30/4/2019 21:08:10 | A | D | 6/12/1956 0:00:00 | HOMBRE | 7333209.0 | 30/4/2019 21:11:55 | TERMINADO | 5 | 30/4/2019 21:08:43 | 1 | 1.0 | ECUADOR | 1.0 | AZUAY | 101.0 | CUENCA | 10156.0 | LLACAO | MESTIZO | TERMINADA | 1/5/2019 0:00:00 | 1/5/2019 9:01:30 | 1/5/2019 9:01:22 | 2200040165 | insritos_per18.csv | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
8 | 9 | 4/12/2017 14:22:45 | A | C | 18/4/1962 0:00:00 | HOMBRE | 7466321.0 | 5/5/2019 19:33:11 | TERMINADO | 5 | 5/5/2019 19:27:33 | 1 | 1.0 | ECUADOR | 9.0 | GUAYAS | 901.0 | GUAYAQUIL | 90109.0 | ROCA | MESTIZO | TERMINADA | 5/5/2019 0:00:00 | 5/5/2019 20:08:37 | 5/5/2019 20:05:20 | 1807610192 | insritos_per18.csv | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
9 | 10 | 26/4/2019 5:06:02 | A | C | 9/10/1959 0:00:00 | MUJER | 7095341.0 | 26/4/2019 5:14:49 | TERMINADO | 5 | 26/4/2019 5:07:29 | 1 | 1.0 | ECUADOR | 1.0 | AZUAY | 101.0 | CUENCA | 10105.0 | EL VECINO | MESTIZO | TERMINADA | 27/4/2019 0:00:00 | 27/4/2019 0:56:26 | 27/4/2019 0:51:12 | 2162120101 | insritos_per18.csv | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
Unnamed: 0USU_FECHAREGISTROUSU_ESTADOUSU_ESTADO_CIVILUSU_FECHA_NACINS_SEXOINS_IDINS_FECHAINS_ESTADOINS_PASOINS_FECHA_ACONDICIONESINS_TIPO_INSCRIPCIONCOD_PAIS_RESIDEPAIS_RESCOD_PROVINCIA_RESIDEPROVINCIA_RESIDECOD_CANTON_RESIDECANTON_RESIDECOD_PARROQUIA_RESIDEPARROQUIA_RESIDEINS_AUTOIDENTIFICACIONENC_ETERMINADAENC_FECHA_UPLOAD_ENCUESTAENC_FECHA_IMPRIMEENC_FECHA_FINALIZA_INSCRIPCIONcod_finalarchivoPER_IDUSU_NACIONALIDADUSU_NACIONALIDAD_EXTRANJERAIAS_OTRA_DISCAPACIDADINS_TIPOCARGA_ENCUESTAFINALIZA_INSCRIPCIONOTRA_DISCAPACIDADTITULO_HOMOLOGADOINTERNET_DOMICILIOCOMPUTADORA_DOMICILIO
139350021397044043.5603AS37075.0MUJERNaN44043.5681TERMINADO544043.568111.0ECUADOR21.0SUCUMBIOS2103.0PUTUMAYO210351.0PALMA ROJAMestizo/aTERMINADA44044.85855NaN44044.858552528966247insritos_per20.csv20.0COLOMBIANA88.00.0ISIEFA Y REGISTRO COMPLETONONOSINO
139350121397144039.82429AC36631.0HOMBRENaN44040.7675TERMINADO544040.767511.0ECUADOR4.0CARCHI401.0TULCAN40153.0JULIO ANDRADE (OREJUELA)Mestizo/aTERMINADA44040.79269NaN44040.792692494296247insritos_per20.csv20.0COLOMBIANA88.00.0ISIEFA Y REGISTRO COMPLETONONOSISI
139350221397236870.79167AC36870.0HOMBRENaN44040.11558TERMINADO544040.1155811.0ECUADOR9.0GUAYAS901.0GUAYAQUIL90112.0TARQUIMestizo/aTERMINADA44040.87287NaN44040.872872324086256insritos_per20.csv20.0COLOMBIANA88.00.0ISIEFA Y REGISTRO COMPLETONOSISISI
139350321397344040.77133AC36988.0HOMBRENaN44040.77256TERMINADO544040.7725611.0ECUADOR9.0GUAYAS901.0GUAYAQUIL90115.0PASCUALESMestizo/aTERMINADA44040.99043NaN44040.990432489286338insritos_per20.csv20.0COLOMBIANA88.00.0ISIEFA Y REGISTRO COMPLETONONOSISI
139350421397444043.6041AS36989.0MUJERNaN44043.61274TERMINADO544043.6127411.0ECUADOR23.0SANTO DOMINGO DE LOS TSACHILAS2301.0SANTO DOMINGO230102.0BOMBOLÍBlanco/aTERMINADA44043.72946NaN44043.729462529846301insritos_per20.csv20.0COLOMBIANA88.00.0ISIEFA Y REGISTRO COMPLETONONOSISI
139350521397544040.94576AC36715.0HOMBRENaN44040.94654TERMINADO544040.9465411.0ECUADOR6.0CHIMBORAZO601.0RIOBAMBA60101.0LIZARZABURUMestizo/aTERMINADA44041.84304NaN44041.843042519676356insritos_per20.csv20.0COLOMBIANA88.00.0ISIEFA Y REGISTRO COMPLETONOSISISI
139350621397644039.40438AC34878.0MUJERNaN44039.40559TERMINADO544039.4055911.0ECUADOR17.0PICHINCHA1701.0DISTRITO METROPOLITANO DE QUITO170119.0LA FERROVIARIAMestizo/aTERMINADA44042.51782NaN44042.517822491766310insritos_per20.csv20.0COLOMBIANA88.00.0ISIEFA Y REGISTRO COMPLETONONOSISI
139350721397744043.62436AS36884.0HOMBRENaN44043.62473TERMINADO544043.6247311.0ECUADOR13.0MANABI1312.0ROCAFUERTE131250.0ROCAFUERTEMestizo/aTERMINADA44043.71148NaN44043.711482531446347insritos_per20.csv20.0COLOMBIANA88.00.0ISIEFA Y REGISTRO COMPLETONONOSINO
139350821397844040.03186AC34534.0HOMBRENaN44040.05444TERMINADO544040.0544411.0ECUADOR21.0SUCUMBIOS2101.0LAGO AGRIO210150.0NUEVA LOJAMestizo/aTERMINADA44040.75642NaN44040.756422518783892insritos_per20.csv20.0CUBANA97.00.0ISIEFA Y REGISTRO COMPLETONOSISISI
139350921397944041.03975AC31825.0HOMBRENaN44041.0406TERMINADO544041.040611.0ECUADOR13.0MANABI1308.0MANTA130802.0MANTABlanco/aTERMINADA44041.09249NaN44041.092492501591956insritos_per20.csv20.0CHILENA85.00.0ISIEFA Y REGISTRO COMPLETONONOSISI
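
The two samples store dates differently: the first rows (from insritos_per18.csv) hold day-first strings such as 28/9/2000 0:00:00, while the last rows (from insritos_per20.csv) hold bare numbers such as 44043.5603 and 37075.0. Those numbers look like spreadsheet serial dates, counted in days from 1899-12-30, but that reading is an assumption; the report itself does not say how they were encoded. Under that assumption, a helper along these lines could bring both forms to a common type before profiling; `parse_mixed_date` is a hypothetical name.

```python
from datetime import datetime, timedelta

# Day zero of the spreadsheet serial-date convention (assumed to apply here).
SERIAL_EPOCH = datetime(1899, 12, 30)


def parse_mixed_date(raw):
    """Parse a 'd/m/yyyy h:mm:ss' string or a serial day count into a datetime."""
    text = str(raw).strip()
    if "/" in text:
        # Day-first order, as values like 28/9/2000 in the first rows imply.
        return datetime.strptime(text, "%d/%m/%Y %H:%M:%S")
    return SERIAL_EPOCH + timedelta(days=float(text))


print(parse_mixed_date("28/9/2000 0:00:00"))
print(parse_mixed_date(44043.5603))  # USU_FECHAREGISTRO from the last rows
print(parse_mixed_date(37075.0))     # USU_FECHA_NAC from the last rows
```

If the assumption holds, the converted registration dates fall in mid-2020 and the birth dates around 2001, which would be plausible for a 2020 enrolment file; a mismatch there would be a sign that the columns use some other encoding.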